Speech recognition


Speech recognition is the task of converting spoken audio into text: identifying the words spoken, drawing on both the acoustic signal and the language, and transcribing them accurately.

Mići Princ -- A Little Boy Teaching Speech Technologies the Chakavian Dialect

Feb 03, 2026
Via arXiv

Mixture-of-Experts with Intermediate CTC Supervision for Accented Speech Recognition

Feb 02, 2026
Via arXiv

BBPE16: UTF-16-based byte-level byte-pair encoding for improved multilingual speech recognition

Feb 02, 2026
Via arXiv

Attention-weighted Centered Kernel Alignment for Knowledge Distillation in Large Audio-Language Models Applied to Speech Emotion Recognition

Feb 02, 2026
Via arXiv

WAXAL: A Large-Scale Multilingual African Language Speech Corpus

Feb 02, 2026
Via arXiv

Semantics-Aware Generative Latent Data Augmentation for Learning in Low-Resource Domains

Feb 02, 2026
Via arXiv

Adapting Where It Matters: Depth-Aware Adaptation for Efficient Multilingual Speech Recognition in Low-Resource Languages

Feb 01, 2026
Via arXiv

EmoAra: Emotion-Preserving English Speech Transcription and Cross-Lingual Translation with Arabic Text-to-Speech

Feb 01, 2026
Via arXiv

MedSpeak: A Knowledge Graph-Aided ASR Error Correction Framework for Spoken Medical QA

Feb 01, 2026
Via arXiv

Streaming Speech Recognition with Decoder-Only Large Language Models and Latency Optimization

Jan 30, 2026
Via arXiv